Dashboard Demonstration



This dashboard shows a handful of data visualization use cases for the Unstructured Data Analytics initiative of the Allegheny County Department of Public Health. This was developed by staff at Chapin Hall and its partners at Argonne National Laboratory.

  • This was made with the free and open-source R programming language
  • Code-driven dashboards like this can be automatically generated to be custom for any given user, and updated with new inputs, analysis, and visualization as part of the full analysis pipeline
  • The code and data used to create this example can be found—and reused and contributed to—at this open-source GitHub repository
  • Note that all data here is fake, intended only for the purpose of illustration

Case-Level Visualization


  • This figure combines
    • text-mined sentiment score
    • text-mined key topics of relevance
    • indication of institutional contacts, including DHS investigation and intervention
  • Low sentiment scores could be used to trigger specific interventions or referrals to other practitioners, when combined with key topic combinations
  • Topic and sentiment scores can help supervisors quickly review timely characteristics of individual cases or a worker’s entire caseload
  • Persistence of key topics can raise flags to QA teams to complete additional modules, e.g. related to housing concerns

Interactive Topic Visualization


A simulation of an interactive topic prevalence+trend figure for the case-level visualization

Macro Planning – Client Query Table


Both sentiment and topical scores can be used for macro planning related to:

  • Volume – who are the children, youth, and families who would best benefit from a new intervention
  • Needs – what are the classes of service need within this population
  • Distribution – where is the service need located
  • Service Matching – how do we match families to the most appropriate services

For example, this type of table allows a user to filter clients by geographic regions (or other case characteristics), select those with combinations of needs, sort by sentiment, and export this information to share with other role-players.

Macro Planning – Topic Scores by Geography


Both sentiment and topical scores can be used for macro planning related to:

  • Volume – who are the children, youth, and families who would best benefit from a new intervention
  • Needs – what are the classes of service need within this population
  • Distribution – where is the service need located
  • Service Matching – how do we match families to the most appropriate services

This map identifies geographies with the greatest need for attention by topic. Users are able to pan, zoom, and click for pop-up annotations with additional contextual information.

This helps identify volume and type of needs, along with distribution. With additional layers showing the distribution of service facitilies it can be possible to also see how demand and supply of services can be matched.

Macro Planning – Trajectory Visualization and Type Identifiation


  • Trajectory visualizations can show pathways of cases through institutional contacts
  • Each individual trajectory is shown as a (narrow) horizontal line. Each subplot shows hundreds of individual cases, with cases sorted by their starting state, then the type and duration of the state that follows.
  • These plots show the starting age, duration, sequence, and number of spells for different institutional contacts
  • The multiple figures show different distinct cluster patterns. In this example:
    • “Cluster 1” represents individuals with short involvements which start at a range of ages
    • “Cluster 2” represents individuals with long, unbroken stints in child welfare cases. A third of this group has out-of-home placements, almost always at early ages.
    • “Cluster 3”–the smallest group–represents individuals with long stints of out-of home placements, generally starting at early ages.
  • These institutional involvement cluster types can also be displayed in tables or maps

---
title: "ACDHS Dashboard Demo"
author: "Chapin Hall"
date: "August 6, 2018"
output: 
  flexdashboard::flex_dashboard:
    storyboard: true
    social: menu
    source: embed
    vertical_layout: fill
    logo: img/AGC-triangle-small.png
    css: flexdash.css
---

```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = FALSE, message = FALSE, warnings = FALSE, error = FALSE)

### Load and--if necessary--install packages ----------------------------------#
package.list <- c("flexdashboard", "data.table", "dplyr", "stringr", "rgdal",
                  "MASS", "plotly", "leaflet", "ggplot2", "scales", "DT")
for (p in package.list){
  if (!p %in% installed.packages()[, "Package"]) install.packages(p)
  library(p, character.only = TRUE)
}

### Load custom functions -----------------------------------------------------#
grepv <- function(p, x, ...) grep(p, x, value = TRUE, ...)
cn <- function(x) colnames(x)

```

### Dashboard Demonstration



This dashboard shows a handful of data visualization use cases for the Unstructured Data Analytics initiative of the Allegheny County Department of Public Health. This was developed by staff at Chapin Hall and its partners at Argonne National Laboratory.

  • This was made with the free and open-source [R programming language](https://www.r-project.org/about.html)
  • Code-driven dashboards like this can be automatically generated to be custom for any given user, and updated with new inputs, analysis, and visualization as part of the full analysis pipeline
  • The code and data used to create this example can be found—and reused and contributed to—at [this open-source GitHub repository](https://github.com/chapinhall/acdhs-demo)
  • **Note** that all data here is fake, intended only for the purpose of illustration
### Case-Level Visualization {data-commentary-width=500} ***
  • This figure combines
    • text-mined sentiment score
    • text-mined key topics of relevance
    • indication of institutional contacts, including DHS investigation and intervention
  • Low sentiment scores could be used to trigger specific interventions or referrals to other practitioners, when combined with key topic combinations
  • Topic and sentiment scores can help supervisors quickly review timely characteristics of individual cases or a worker's entire caseload
  • Persistence of key topics can raise flags to QA teams to complete additional modules, e.g. related to housing concerns
### Interactive Topic Visualization ```{r place interactive topic viz} load("img/interactive_topic_viz.Rda") topic_viz ``` *** A simulation of an interactive topic prevalence+trend figure for the case-level visualization ### Macro Planning -- Client Query Table {data-commentary-width=500} ```{r generate macro planning data} mpn <- 500 Draw1to100 <- function(n, pwr) round(runif(n)^pwr*100, 0) planets <- c("Mercury", "Venus", "Earth", "Mars", "Jupiter", "Saturn", "Neptune", "Uranus") mp <- data.frame(ID = str_pad(round(runif(mpn)*1e7, 0), width = 7, side = "left", pad = "0"), AQ = paste0(sample(c(2017, 2018), mpn, replace = TRUE), "-Q", sample(1:4, mpn, replace = TRUE)), Region = factor(sample(planets, mpn, replace = TRUE), levels = planets), Sentiment = Draw1to100(mpn, 1.0), MH = Draw1to100(mpn, 2.0), DA = Draw1to100(mpn, 2.0), DV = Draw1to100(mpn, 2.0), H = Draw1to100(mpn, 2.0), E = Draw1to100(mpn, 0.5), M = Draw1to100(mpn, 1.0), SC = Draw1to100(mpn, 0.5)) %>% filter(!AQ %in% c("2018-Q3", "2018-Q4")) ``` ```{r place and format dynamic table} # Develop color scale for quantiles of all top # /!\ An alternative scheme might just be for 1-10, 11-20, etc as hard divisions topics <- c("MH", "DA", "DV", "H", "E", "M", "SC") topic_brks <- quantile(select(mp, one_of(topics)), probs = seq(.05, .95, .05), na.rm = TRUE) # Develop colors as increasing shades of blue topic_clrs <- round(seq(255, 40, length.out = length(topic_brks) + 1), 0) %>% paste0("rgb(", ., ",", ., ",255)") topicNames <- c("Mental Health", "Drug/Alcohol", "Domestic Violence", "Housing", "Education", "Medical", "Social Connection") sent_brks <- quantile(mp$Sentiment, probs = seq(.05, .95, .05), na.rm = TRUE) sent_clrs <- colorRampPalette(c("firebrick2", "forestgreen"))(length(sent_brks) + 1) colnames(mp) <- c("ID", "Assessment Quarter", "Region", "Sentiment", topicNames) rownames(mp) <- NULL datatable(mp, extensions = 'Buttons', filter = 'top', rownames = FALSE, options = list(dom = 'Bfrtip', buttons = list('copy', 'print', list( extend = 'collection', buttons = c('csv', 'excel', 'pdf'), text = 'Download')), pageLength = 20)) %>% formatStyle("Sentiment", backgroundColor = styleInterval(sent_brks, sent_clrs), color = "white") %>% formatStyle(topicNames, backgroundColor = styleInterval(topic_brks, topic_clrs)) ``` *** Both sentiment and topical scores can be used for macro planning related to: * **Volume** -- who are the children, youth, and families who would best benefit from a new intervention * **Needs** -- what are the classes of service need within this population * **Distribution** -- where is the service need located * **Service Matching** -- how do we match families to the most appropriate services For example, this type of table allows a user to filter clients by geographic regions (or other case characteristics), select those with combinations of needs, sort by sentiment, and export this information to share with other role-players. ### Macro Planning -- Topic Scores by Geography {data-commentary-width=500} ```{r display zipcode map with topic layers} # Get zipcode boundary data zips <- readOGR(dsn = "data", layer = "Allegheny_County_Zip_Code_Boundaries", verbose = FALSE) # Sample topics: topics <- c("Pregnant", "Domestic Violence", "Opiates", "Suicide", "Low Income") sigma <- matrix(c(1.0, 0.2, 0.1, 0.2, 0.4, 0.2, 1.0, 0.3, 0.4, 0.5, 0.1, 0.3, 1.0, 0.4, 0.6, 0.2, 0.4, 0.4, 1.0, 0.4, 0.4, 0.5, 0.6, 0.4, 2.0), nrow = 5) mu <- c(0.8, 1.5, 0.0, -1.0, 1.0) # Draw correlated values set.seed(8062018) draws <- mvrnorm(n = nrow(zips@data), mu = mu, Sigma = sigma) # Put this on a 0-100 scale, label, and arrange in order (to give a pattern to matching with zips) draws <- round((draws - min(draws))/(max(draws) - min(draws))*100, 1) colnames(draws) <- topics draws <- arrange(as.data.frame(draws), -`Low Income`) # Attach values to zipcodes zips@data <- cbind(zips@data, draws) %>% within({ topicPop <- paste0("Topic Scores for Zip ", ZIP, ":
", "Pregnant: ", sprintf("%1.1f", Pregnant), "
", "Domestic Violence: ", sprintf("%1.1f", `Domestic Violence`), "
", "Opiates: ", sprintf("%1.1f", Opiates), "
", "Suicide: ", sprintf("%1.1f", Suicide), "
", "Low Income: ", sprintf("%1.1f", `Low Income`), "

", "More Census data on zip code ", ZIP, "") geo_label <- paste0(str_to_title(NAME), ", zip: ", ZIP) }) palBin_blue <- colorBin(palette = "Blues", domain = c(0, 100), bins = 5) myOpacity <- 0.7 leaflet(data = zips) %>% # width = "100%" #addTiles() %>% addProviderTiles("CartoDB.Positron") %>% addPolygons(group = "Pregnant", fillColor = ~palBin_blue(Pregnant), popup = ~topicPop, weight = 2.0, fillOpacity = myOpacity, label=~geo_label, labelOptions= labelOptions(direction = 'auto'), highlightOptions = highlightOptions( color='#ff0000', opacity = 1, weight = 2, fillOpacity = 1, bringToFront = TRUE, sendToBack = TRUE)) %>% addPolygons(group = "Domestic Violence", fillColor = ~palBin_blue(`Domestic Violence`), popup = ~topicPop, weight = 2.0, fillOpacity = myOpacity, label=~geo_label, labelOptions= labelOptions(direction = 'auto'), highlightOptions = highlightOptions( color='#ff0000', opacity = 1, weight = 2, fillOpacity = 1, bringToFront = TRUE, sendToBack = TRUE)) %>% addPolygons(group = "Opiates", fillColor = ~palBin_blue(Opiates), popup = ~topicPop, weight = 2.0, fillOpacity = myOpacity, label=~geo_label, labelOptions= labelOptions(direction = 'auto'), highlightOptions = highlightOptions( color='#ff0000', opacity = 1, weight = 2, fillOpacity = 1, bringToFront = TRUE, sendToBack = TRUE)) %>% addPolygons(group = "Suicide", fillColor = ~palBin_blue(Suicide), popup = ~topicPop, weight = 2.0, fillOpacity = myOpacity, label=~geo_label, labelOptions= labelOptions(direction = 'auto'), highlightOptions = highlightOptions( color='#ff0000', opacity = 1, weight = 2, fillOpacity = 1, bringToFront = TRUE, sendToBack = TRUE)) %>% addPolygons(group = "Low Income", fillColor = ~palBin_blue(`Low Income`), popup = ~topicPop, weight = 2.0, fillOpacity = myOpacity, label=~geo_label, labelOptions= labelOptions(direction = 'auto'), highlightOptions = highlightOptions( color='#ff0000', opacity = 1, weight = 2, fillOpacity = 1, bringToFront = TRUE, sendToBack = TRUE)) %>% addLayersControl(baseGroups = c("Pregnant", "Domestic Violence", "Opiates", "Suicide", "Low Income"), options = layersControlOptions(collapsed = FALSE)) %>% addLegend(position = "bottomleft", colors = unique(palBin_blue(0:100)), title = "Topic Scores", labels = c("0-20", "20-40", "40-60", "60-80", "80-100")) ``` *** Both sentiment and topical scores can be used for macro planning related to: * **Volume** -- who are the children, youth, and families who would best benefit from a new intervention * **Needs** -- what are the classes of service need within this population * **Distribution** -- where is the service need located * **Service Matching** -- how do we match families to the most appropriate services This map identifies geographies with the greatest need for attention by topic. Users are able to pan, zoom, and click for pop-up annotations with additional contextual information. This helps identify volume and type of needs, along with distribution. With additional layers showing the distribution of service facitilies it can be possible to also see how demand and supply of services can be matched. ### Macro Planning -- Trajectory Visualization and Type Identifiation {data-commentary-width=500} ```{r generate prevalence figure to match trajectory viz} df <- data.frame(clus = c("Cluster 1", "Cluster 2", "Cluster 3"), freq = c(5442, 679, 227)) %>% mutate(pct = freq/sum(freq), label = paste0("n = ", prettyNum(freq, big.mark = ","))) myplot <- ggplot(df, aes(x = clus, y = pct, fill = clus)) + geom_bar(stat = "identity", position = "dodge") + geom_text(aes(label = label), vjust = 1) + scale_y_continuous(labels = percent) + labs(title = "Prevalence of Cluster Types in Population", x = "", y = "% of Population") + theme(legend.position = "none", axis.title = element_text(size = 12), axis.text = element_text(size = 11)) ggsave("img/dually-involved-youth_traj-viz_3-cluster--prevalence.png") ``` ***
  • Trajectory visualizations can show pathways of cases through institutional contacts
  • Each individual trajectory is shown as a (narrow) horizontal line. Each subplot shows hundreds of individual cases, with cases sorted by their starting state, then the type and duration of the state that follows.
  • These plots show the starting age, duration, sequence, and number of spells for different institutional contacts
  • The multiple figures show different distinct cluster patterns. In this example:
    • "Cluster 1" represents individuals with short involvements which start at a range of ages
    • "Cluster 2" represents individuals with long, unbroken stints in child welfare cases. A third of this group has out-of-home placements, almost always at early ages.
    • "Cluster 3"--the smallest group--represents individuals with long stints of out-of home placements, generally starting at early ages.
  • These institutional involvement cluster types can also be displayed in tables or maps